Clinical nomogram prediction model to assess the risk of prolonged ICU length of stay in patients with diabetic ketoacidosis: a retrospective analysis based on the MIMIC-IV database

Background: The duration of hospitalisation, especially in the intensive care unit (ICU), for patients with diabetic ketoacidosis (DKA) is closely related to patient prognosis and treatment costs. Reducing ICU length of stay (LOS) in patients with DKA is crucial for optimising healthcare resource utilisation. This study aimed to establish a nomogram prediction model to identify the risk factors influencing prolonged LOS in ICU-managed patients with DKA, which will serve as a basis for clinical treatment, healthcare safety, and quality management research.
Methods: In this single-centre retrospective cohort study, we analysed relevant data extracted from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. Clinical data from 669 patients with DKA requiring ICU treatment were included. Variables were selected using the Least Absolute Shrinkage and Selection Operator (LASSO) binary logistic regression model. Subsequently, the selected variables were subjected to a multivariable logistic regression analysis to determine independent risk factors for prolonged ICU LOS in patients with DKA. A nomogram prediction model was constructed based on the identified predictors. The variables included in this nomogram prediction model were the Oxford acute severity of illness score (OASIS), Glasgow coma scale (GCS), acute kidney injury (AKI) stage, vasoactive agents, and myocardial infarction.
Results: The prediction model had a high predictive efficacy, with an area under the curve value of 0.870 (95% confidence interval [CI], 0.831–0.908) in the training cohort and 0.858 (95% CI, 0.799–0.916) in the validation cohort. The Hosmer–Lemeshow (H-L) test and calibration plots indicated good calibration in both cohorts.
Conclusion: The nomogram prediction model proposed in this study has a high clinical application value for predicting prolonged ICU LOS in patients with DKA. This model can help clinicians identify patients with DKA at risk of prolonged ICU LOS, thereby enhancing prompt intervention and improving prognosis.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12871-024-02467-z.

Background
Diabetic ketoacidosis (DKA) occurs when insulin secretion is insufficient to inhibit the production of blood ketone bodies. It is the most common acute life-threatening complication in patients with diabetes mellitus and is one of the leading causes of death in this patient population [1,2]. Although the mortality rate of patients with DKA is decreasing, the hospitalisation rate remains high [3,4], especially as the length of stay (LOS) for patients with DKA requiring further treatment in the intensive care unit (ICU) continues to rise. This trend places a great burden on healthcare resources worldwide. ICU costs account for a large part of the total cost of hospitalisation, posing a great challenge to the global economy [5,6].
The length of hospitalisation in patients with DKA may be influenced by their prognosis. Ata et al. [7] showed that patients with DKA with longer hospitalisation periods are at a higher risk of comorbidities, leading to a poor prognosis, further increasing the economic burden, and creating a vicious circle. Patients with DKA treated in general wards have been reported to exhibit no significant difference in mortality rate but to have lower treatment costs compared with patients with DKA admitted to the ICU [8,9]. Therefore, the LOS of patients with DKA, especially in the ICU, is closely related to patient prognosis and treatment cost.
Previous studies have shown that several factors influence LOS in patients with DKA. However, a highly sensitive and specific predictive model to assess ICU LOS in patients with DKA is lacking [7,10]. Therefore, this study aimed to develop a nomogram prediction model to identify the risk factors influencing prolonged ICU LOS in patients with DKA. This model may serve as a basis for clinical treatment, healthcare safety, and quality management studies.

Study design and data source
We performed a retrospective analysis using all relevant data extracted from the MIMIC-IV database [11]. This database comprises patient-related data collected in the ICUs of the Beth Israel Deaconess Medical Centre between 2008 and 2019. This public-access database is supported by the Department of Medicine at the Beth Israel Deaconess Medical Centre and the Computational Physiology Laboratory at Massachusetts Institute of Technology (MIT) and is freely accessible to any qualified PhysioNet user. CFJ received a certificate (No: 43,529,529) and permission to use the MIMIC-IV database after completing the web-based course. The Institutional Review Boards of MIT (Cambridge, MA, USA) and the Beth Israel Deaconess Medical Centre (Boston, MA, USA) approved the data collection and use of MIMIC-IV for research purposes and granted a waiver of informed consent.

Participant selection criteria
In this study, all patients admitted to the ICU who met the diagnostic criteria for DKA were included following the ICD-9/10 diagnostic codes in the database [1]. The diagnostic criteria for DKA were as follows: (i) glucose > 13.8 mmol/L, (ii) positive urine or serum ketones or β-hydroxybutyrate > 3 mmol/L, and (iii) arterial or venous pH < 7.3, bicarbonate < 18 mmol/L and anion gap > 10-12 mmol/L [1].
Patients with repeat ICU admissions, in-hospital deaths, and those admitted to the ICU for less than 24 h were excluded. A total of 669 patients with DKA were randomly assigned in a 7:3 ratio to a training cohort (n = 464) for nomogram model development and a validation cohort (n = 205) for internal validation of the nomogram model's performance. The study population enrolment flowchart is presented in Fig. 1.

Data extraction and definition of terms
Several variables were extracted from the database, including patient demographics, vital signs, comorbidities, laboratory indicators, scoring systems, and medical interventions. All data were collected within 24 h of ICU admission. Considering that several variables were measured multiple times, the worst values of laboratory variables recorded within 24 h of ICU admission were used for analysis and included in the predictive model.
ICU LOS was defined as the period from day 1 of ICU admission to the day before transfer from the ICU. Prolonged ICU LOS was defined as an ICU LOS at or above the 75th percentile (i.e., ≥ 75 h) of ICU LOS for all patients enrolled in this study. The cohorts were divided into the normal and prolonged groups according to whether the ICU LOS was ≥ 75 h.
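As a minimal illustration of this definition, the following Python sketch derives a 75th-percentile threshold from a list of ICU stays and flags each stay as normal or prolonged. The stay durations are hypothetical and are not study data, the function name `label_prolonged` is ours, and the interpolation rule used for the percentile (the "inclusive" method of the Python standard library) is an assumption, since the text does not state which rule was applied.

```python
from statistics import quantiles


def label_prolonged(los_hours, percentile=75):
    """Return (threshold, flags); flags[i] is True when stay i is prolonged."""
    # quantiles(..., n=100) returns the 99 cut points for the 1st..99th percentiles.
    cuts = quantiles(los_hours, n=100, method="inclusive")
    threshold = cuts[percentile - 1]
    flags = [hours >= threshold for hours in los_hours]
    return threshold, flags


# Hypothetical ICU stays in hours (illustration only, not MIMIC-IV data).
stays = [26, 30, 41, 48, 52, 60, 75, 90, 120, 200]
threshold, prolonged = label_prolonged(stays)
print(f"threshold = {threshold} h, prolonged stays = {sum(prolonged)}")
```

With a cohort-level threshold computed once in this way, each patient is assigned to the normal or prolonged group by a single comparison, which is how the 75 h cut-off is used in the study.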

Statistical analysis
Statistical analyses were performed using R version 4.3.1 (R Foundation for Statistical Computing, Vienna, Austria). Normally distributed measures were expressed as mean ± standard deviation, non-normally distributed measures as medians and quartiles, and count data as frequencies and percentages. The unpaired t-test was used to compare normally distributed variables between groups, the Mann-Whitney U test was used for non-normally distributed variables, and the chi-square test was used for categorical variables. Parameters with more than 20% missing values were excluded from the analysis. The missing values for all extracted variables are listed in Supplementary Table 1. Missing values for the remaining parameters were filled in using multiple imputation with the 'mice' package in R. The 'caret' package was used to randomly divide the 669 patients with DKA into a training set of 464 participants and a validation set of 205 participants for internal validation, approximating the theoretical ratio of 7:3. Least absolute shrinkage and selection operator (LASSO) regression, a shrinkage and variable selection method for regression models, was performed using the 'glmnet' package. The features selected by the LASSO regression were then entered into a multivariable logistic regression analysis to construct the predictive model, and the 'rms' package was used to develop the nomogram from this model. Receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was calculated using the 'pROC' package to assess the discriminatory ability of the nomogram. Calibration of the nomogram was evaluated using calibration curves drawn with the 'rms' package and the Hosmer-Lemeshow test.
For the assessment of clinical practicability based on net benefit under various threshold probabilities, decision curve analysis (DCA) was conducted using the 'rmda' package. A P-value of < 0.050 was considered statistically significant.

Patient clinical characteristics
A total of 669 patients with DKA were included in the analysis, of whom 464 were in the training cohort and 205 in the validation cohort. The training and validation cohorts comprised 236 and 128 male patients, respectively. In both cohorts, the DKA stage was predominantly mild (59.3% in the training group and 70.2% in the validation group), and type 1 diabetes mellitus (T1DM) was the predominant diagnosis. The training cohort showed slightly higher chloride levels than the validation cohort. The training cohort had slightly lower bicarbonate, pH, and red blood cell distribution width than the validation cohort. There were no significant differences between the training and validation cohort patients regarding the other variables (P > 0.050). These results justify the use of training and validation cohorts. Detailed clinical characteristics of the patients are listed in Table 1.
Fig. 1 The flowchart of the study

Variable selection
Based on the demographics, vital signs, medical history, laboratory parameters, scoring systems, and treatments of patients in the training cohort, six predictor variables with non-zero coefficients were identified out of the initial 60 variables using LASSO regression analysis (Fig. 2). Vertical lines were plotted at the minimum value of λ (λ = 0.021) and at the value 1 standard error (SE) from the minimum (λ = 0.071). At the point where log(λ) = -1.150, six variables with non-zero coefficients were identified as the most appropriate predictor variables in the LASSO regression model. The predictor variables included Oxford acute severity of illness score (OASIS), Glasgow coma scale (GCS), acute kidney injury (AKI) stage, vasoactive agents, and myocardial infarction.
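The two vertical lines described above correspond to two common choices of λ in cross-validated LASSO. The sketch below assumes that the larger value was chosen by the conventional one-standard-error rule, that is, the largest λ whose cross-validated error lies within 1 SE of the minimum error. The cross-validation errors and the helper name `select_lambdas` are hypothetical; only the two λ values are taken from the text. The last line also shows that the base-10 logarithm of 0.071 is approximately -1.149, which would be consistent with the reported log(λ) of -1.150 if a base-10 scale was used (an assumption on our part).

```python
import math


def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve."""
    # Index of the lambda with the lowest mean cross-validated error.
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    # One-SE rule: the largest lambda whose error is within 1 SE of the minimum.
    limit = cv_mean[i_min] + cv_se[i_min]
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= limit)
    return lambdas[i_min], lambda_1se


# Hypothetical cross-validation curve (deviance values are invented).
lambdas = [0.005, 0.021, 0.040, 0.071, 0.120]
cv_mean = [0.82, 0.78, 0.79, 0.80, 0.90]
cv_se = [0.03, 0.03, 0.03, 0.03, 0.03]

lam_min, lam_1se = select_lambdas(lambdas, cv_mean, cv_se)
log10_lam_1se = math.log10(lam_1se)
print(lam_min, lam_1se, round(log10_lam_1se, 3))
```

Choosing the larger λ gives a sparser model at a small cost in cross-validated error, which is the usual motivation for reporting both lines on the plot.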

Construction of nomogram prediction model
A multivariable logistic regression model was constructed using the six predictor variables selected by the LASSO regression analysis as independent variables (Fig. 2). The results revealed OASIS, GCS, AKI stage, vasoactive agents, and myocardial infarction as independent factors associated with ICU LOS prolongation in patients with DKA (P < 0.05) (Table 2). A nomogram predicting the individual probability of prolonged ICU LOS in patients with DKA was constructed using these predictor variables. The nomogram is used by assigning a score to the value of each variable and then summing the scores of all variables to obtain the total score. A vertical line is drawn downward from the total score to read off the estimated probability of prolonged ICU LOS in patients with DKA (Fig. 3).
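The scoring procedure described above is a graphical form of a logistic regression equation. The following sketch illustrates the idea under stated assumptions: the intercept, coefficients, and value ranges are hypothetical and are not the fitted values from Table 2, and the point scale (100 points for the predictor with the largest effect range) is one common convention rather than a confirmed detail of the published nomogram.

```python
import math

# HYPOTHETICAL intercept, coefficients and ranges (not the study's fitted model).
INTERCEPT = -3.0
PREDICTORS = {
    # name: (coefficient, minimum value, maximum value)
    "oasis": (0.08, 10, 60),
    "gcs": (-0.19, 3, 15),
    "aki_stage": (0.50, 0, 3),
    "vasoactive": (1.00, 0, 1),
    "mi": (0.90, 0, 1),
}


def min_contribution(beta, lo, hi):
    """Smallest value of beta * x over the predictor's range."""
    return min(beta * lo, beta * hi)


# The predictor with the widest effect range spans the full 0-100 point axis.
SCALE = max(abs(beta) * (hi - lo) for beta, lo, hi in PREDICTORS.values())


def points(name, value):
    """Nomogram points for one predictor value."""
    beta, lo, hi = PREDICTORS[name]
    return 100 * (beta * value - min_contribution(beta, lo, hi)) / SCALE


def probability_from_points(total_points):
    """Map total points back to the linear predictor, then to a probability."""
    base = INTERCEPT + sum(min_contribution(*p) for p in PREDICTORS.values())
    linear_predictor = base + total_points * SCALE / 100
    return 1 / (1 + math.exp(-linear_predictor))


# Hypothetical patient.
patient = {"oasis": 35, "gcs": 12, "aki_stage": 2, "vasoactive": 1, "mi": 0}
total = sum(points(name, value) for name, value in patient.items())
risk = probability_from_points(total)
print(f"total points = {total:.2f}, predicted probability = {risk:.3f}")
```

Because the point scale is a linear rescaling of the regression terms, reading the probability from the total score gives the same result as evaluating the logistic equation directly.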

Discriminatory ability of the nomogram
The discriminatory ability of the nomogram was assessed by calculating the AUC and plotting the ROC curve for the predictive model. The AUC for the training cohort was 0.870 (95% confidence interval [CI], 0.831-0.908), with an optimal cut-off value of 0.221. In the validation cohort, the AUC was 0.858 (95% CI, 0.799-0.916), with an optimal cut-off value of 0.207. The AUC was high in both cohorts, indicating that the nomogram prediction model has a good discriminatory ability (Fig. 4).
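For readers less familiar with these quantities, the sketch below computes an AUC as the probability that a randomly chosen prolonged-stay patient receives a higher predicted risk than a randomly chosen normal-stay patient, and selects a cut-off by maximising the Youden index (sensitivity + specificity - 1). The use of the Youden index is an assumption, as the text does not state how the optimal cut-off was chosen, and the outcomes and predicted risks are hypothetical.

```python
def auc(y_true, scores):
    """AUC as the proportion of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(y_true, scores):
    """Return (cut-off, J) maximising J = sensitivity + specificity - 1."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        # A patient is classified as high risk when the score is >= cut.
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cut)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical outcomes (1 = prolonged ICU LOS) and predicted probabilities.
y = [0, 0, 0, 0, 1, 0, 1, 1]
risk_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.60, 0.80]

model_auc = auc(y, risk_scores)
cutoff, j_stat = youden_cutoff(y, risk_scores)
print(f"AUC = {model_auc:.3f}, cut-off = {cutoff}, J = {j_stat:.2f}")
```

Note that a cut-off near 0.2, as reported for both cohorts, is plausible for an outcome defined by the upper quartile, since roughly one in four patients is expected to be positive.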

Accuracy of the nomogram
The Hosmer-Lemeshow test showed a good fit (P = 0.715 for the training cohort and P = 0.373 for the validation cohort), indicating that the predicted probability of the nomogram was consistent with the actual probability, demonstrating good calibration. In addition, calibration curves for both the training and validation cohorts showed moderate agreement, and the nomogram had a good calibration ability (Fig. 5).
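The logic of the Hosmer-Lemeshow test can be sketched as follows: patients are sorted by predicted probability and divided into groups, the observed and expected numbers of events are compared within each group, and the resulting statistic is referred to a chi-square distribution with (groups - 2) degrees of freedom. The data are hypothetical, the function names are ours, and the p-value helper uses a closed form that is valid only for an even number of degrees of freedom; it is a simplification for illustration and not the routine used in the study.

```python
import math


def hosmer_lemeshow(y_true, probs, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-sized risk groups."""
    pairs = sorted(zip(probs, y_true))
    n = len(pairs)
    stat = 0.0
    for k in range(groups):
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / len(chunk)
        # Assumes 0 < mean_p < 1 in every group; otherwise the term is undefined.
        stat += (observed - expected) ** 2 / (len(chunk) * mean_p * (1 - mean_p))
    return stat


def chi2_sf_even(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical data: four risk groups of ten patients each.
probs = [0.1] * 10 + [0.3] * 10 + [0.6] * 10 + [0.9] * 10
events_per_group = [1, 4, 5, 9]
outcomes = []
for events in events_per_group:
    outcomes += [1] * events + [0] * (10 - events)

hl_stat = hosmer_lemeshow(outcomes, probs, groups=4)
hl_p = chi2_sf_even(hl_stat, df=4 - 2)
print(f"H-L statistic = {hl_stat:.3f}, P = {hl_p:.3f}")
```

A large P-value, as in the study, means that the observed event counts do not differ detectably from those expected under the model; it does not by itself prove that calibration is good, which is why calibration curves are reported alongside the test.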

Clinical usefulness of the nomogram
The clinical usefulness of the nomogram prediction model was assessed using DCA. The DCA for the nomogram was conducted in both the training and validation cohorts. The horizontal line represents the scenario in which no patient receives the intervention, with a net benefit of zero. The oblique line represents the scenario in which all patients receive the intervention. In the training cohort, at predicted probability thresholds of 5-85%, the net benefit ranged from 4 to 27%. In the validation cohort, at predicted probability thresholds of 4-99%, the net benefit ranged from 1 to 27%. Within these ranges, the net benefit of the nomogram was higher than that of the two extreme strategies of intervening in all patients or in none (Fig. 6).
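The net benefit plotted in a decision curve can be computed directly from its standard definition: the proportion of true positives minus the proportion of false positives weighted by the odds of the threshold probability. The sketch below applies this definition to hypothetical outcomes and predicted risks and compares the model with the "intervene in all" strategy; the function names are ours, and the treatment of patients exactly at the threshold (counted as high risk) is an assumption.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every patient receives the intervention."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = prolonged ICU LOS) and predicted probabilities.
y = [0, 0, 0, 0, 1, 0, 1, 1]
predicted = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.60, 0.80]

nb_model = net_benefit(y, predicted, threshold=0.25)
nb_all = net_benefit_treat_all(y, threshold=0.25)
nb_none = 0.0  # intervening in no one always has zero net benefit
print(f"model = {nb_model:.3f}, treat all = {nb_all:.3f}, treat none = {nb_none}")
```

Repeating the calculation over a grid of thresholds yields the decision curve; the model is considered useful over the thresholds at which its net benefit exceeds both reference strategies.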

Discussion
We constructed a nomogram based on the MIMIC-IV database to predict the risk of prolonged ICU LOS in patients with DKA. The robustness of the nomogram was enhanced by screening candidate factors using LASSO regression to limit collinearity and overfitting [12]. Subsequently, multivariable logistic regression analysis was performed on the selected indicators. Ultimately, five key indicators, namely OASIS, GCS, AKI stage, vasoactive agents, and myocardial infarction, were identified as predictors in this model. During model validation, the AUC of our nomogram was 0.870 and 0.858 in the training and validation cohorts, respectively, indicating a satisfactory predictive performance. Calibration plots demonstrated a satisfactory agreement between the actual and predicted values. Furthermore, the nomogram demonstrated good clinical utility through DCA. The nomogram developed in this study predicts the probability of prolonged ICU LOS in patients with DKA based on medical history information, clinical investigations, and medications. In addition, it provides a valuable clinical reference for developing strategies to prevent and control prolonged ICU LOS in patients with DKA.
The OASIS score consists of 10 easily accessible basic parameters and is primarily used to assess the prognosis of critically ill patients [13]. In line with our findings, patients with DKA with high OASIS scores had longer ICU stays, suggesting that prolonged hospitalisation may be associated with poor prognosis.
Changes in the level of consciousness are important clinical symptoms and criteria for evaluating disease severity in patients with DKA. Cerebral oedema can occur in patients with DKA due to the combined effects of various factors, such as severe water loss, circulatory disorders, increased osmotic pressure, and cerebral cell hypoxia, causing central nervous system dysfunction and varying degrees of impaired consciousness [14,15]. Fluid resuscitation is the initial step in relieving circulatory disturbances in patients with DKA. However, excessive fluid resuscitation can exacerbate cerebral oedema; thus, titration of fluid resuscitation is crucial [16,17]. Additionally, hypertonic therapy helps in transferring intracranial water into the bloodstream, ameliorating cerebral oedema with minimal impact on neurological outcomes [18]. Finally, balanced oxygen therapy is effective in improving neurological outcomes by addressing ischaemia and hypoxia in oedematous cerebral tissue, thereby suppressing neuroinflammation [19]. Therefore, titration of fluid resuscitation, hypertonic therapy, and balanced oxygen therapy are essential to avoid cerebral oedema. GCS is one of the most commonly used clinical tools for assessing consciousness. Accordingly, a higher GCS was found to be a protective factor against prolonged ICU LOS in patients with DKA in our study (odds ratio: 0.83 (0.72-0.95), P = 0.005). A lower GCS indicates critical disease progression, and such patients tend to require a longer duration of ICU treatment.
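To give a sense of scale for the reported odds ratio, the short sketch below shows how an odds ratio compounds across several GCS points. It assumes that the value of 0.83 applies per one-point increase in GCS and that the effect is log-linear, as in a standard logistic regression term; neither detail is stated explicitly in the text, and the baseline risk used in the example is hypothetical.

```python
OR_PER_GCS_POINT = 0.83  # odds ratio reported in the text


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio for a change of `delta` units under a log-linear effect."""
    return or_per_unit ** delta


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Convert a baseline probability to the probability after applying an OR."""
    odds = baseline_probability / (1 - baseline_probability) * odds_ratio
    return odds / (1 + odds)


# Odds of prolonged ICU LOS for a patient whose GCS is 5 points higher.
or_five_points = odds_ratio_for_change(OR_PER_GCS_POINT, 5)
# Hypothetical baseline risk of 40% at the lower GCS, other factors equal.
risk_higher_gcs = apply_odds_ratio(0.40, or_five_points)
print(f"OR for +5 points = {or_five_points:.3f}, risk = {risk_higher_gcs:.3f}")
```

Under these assumptions, a five-point difference in GCS corresponds to roughly 60% lower odds of a prolonged stay, which illustrates why a modest per-point odds ratio can still be clinically relevant.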
We also examined the effect of the primary disease on ICU LOS in patients with DKA and found that myocardial infarction was an independent risk factor for prolonged ICU LOS. Issa et al. [20] found that myocardial infarction and the incidence of DKA are closely linked, with uncontrolled hyperglycaemia leading to elevated blood catecholamine levels, further exacerbating oxidative stress and causing endothelial and microvascular dysfunction [21]. Simultaneously, patients with myocardial infarction often experience hemodynamic and hormonal disturbances, further contributing to the development of DKA in patients with diabetes [22]. DKA and myocardial infarction can mutually influence and trigger each other. Determining the sequence of occurrence between these conditions is often challenging. Therefore, physicians managing patients with DKA must be alert to the potential presence of myocardial infarction, as it could contribute to prolonged ICU LOS in these patients.
We found that patients with comorbid AKI stages II-III tended to experience longer ICU LOS. AKI is a common complication in patients with DKA and carries high mortality and morbidity rates, especially in critically ill patients [23]. Renal ischaemia-reperfusion injury is a common cause of AKI, and osmotic diuresis is a major risk factor for AKI in patients with DKA [1]. AKI is characterized by a sudden deterioration of renal function and a decrease in urine output, leading to disturbances in electrolyte and acid-base metabolism, volume overload, and damage to other organ systems as a result of these disturbances [24]. Thus, AKI progression is associated with DKA severity, further contributing to prolonged hospitalisation. In addition, similar to our findings, Fan et al. [25] found that a lower GCS score was also an independent risk factor for AKI in patients with DKA, further supporting GCS as an important indicator in patients with DKA. Consequently, this underscores the need for heightened clinical vigilance toward changes in patients' consciousness and for timely intervention during the early stages of disease progression.
Our study also found that the use of vasoactive agents was an independent risk factor for prolonged ICU LOS in patients with DKA, consistent with findings in other diseases [26]. The use of vasoactive agents suggests a state of hypoperfusion and hemodynamic instability [27], requiring longer ICU monitoring than in patients not receiving such agents and ultimately resulting in prolonged ICU LOS.
This study has some limitations. Firstly, this was a single-centre retrospective study with a relatively small sample size, leading to potential selection bias and a less representative sample. Although the stability of our nomogram was tested by internal validation, further external validation across broader demographic groups is warranted. Secondly, owing to > 20% missing data in the database, our study did not include several potentially important factors, including glycated haemoglobin (HbA1c), glycaemic lability index [28,29], albumin nutritional scores, urinary ketones, and causative factors for ketoacidosis. Thirdly, this study was limited by the current availability of databases. Although we found that DKA stage, hypoglycaemia, and mechanical ventilation may prolong ICU LOS in patients with DKA, these factors were not included in the

Conclusion
The nomogram prediction model, constructed based on the five independent risk factors identified in this study, demonstrated good predictive efficacy in assessing the risk of prolonged ICU LOS in patients with DKA. After calibration to ensure reliable predictive accuracy, the model exhibited good clinical utility, as evidenced by DCA. This can aid both patients and clinicians in determining prognosis and making informed clinical decisions. However, the recognized limitations underscore the need for ongoing research to explore additional influential factors, including HbA1c and glycaemic lability index. A comprehensive understanding of these variables will contribute to refining predictive models and enhancing their effectiveness in guiding clinical decisions for optimal patient outcomes.

Fig. 3 Nomogram for predicting prolonged ICU LOS in patients with DKA

Table 1
Summary statistics of patient clinical characteristics


Table 1 (continued)
Abbreviations: BMI, body mass index; DM, diabetes mellitus; T1DM, type 1 diabetes mellitus; T2DM, type 2 diabetes mellitus; RR, respiratory rate; HR, heart rate; MAP, mean arterial pressure; UO, urine volume; GCS, Glasgow coma scale; SOFA, sequential organ failure assessment; LODS, logistic organ dysfunction system; OASIS, Oxford acute severity of illness score; SAPSII, simplified acute physiology score II; APSIII, acute physiology score III; CCI, Charlson comorbidity index; WBC, white blood cell; RDW, red blood cell distribution width; AG, anion gap; POP, plasma osmotic pressure; PH, potential of hydrogen; BUN, blood urea nitrogen; HHS, hyperosmolar hyperglycaemic state; AKI, acute kidney injury; CRRT, continuous renal replacement therapy

Table 2
Results of the multivariate logistic regression analysis based on the LASSO regression results
Abbreviations: LASSO, least absolute shrinkage and selection operator; OR, odds ratio; GCS, Glasgow coma scale; SOFA, sequential organ failure assessment; AKI, acute kidney injury